Ad Hoc Data and the Token Ambiguity Problem
Abstract
PADS is a declarative language for describing the syntax and semantic properties of ad hoc data sources such as financial transactions, server logs and scientific data sets. The PADS compiler reads these descriptions and generates a suite of data processing tools, including format translators, parsers, printers and even a query engine, all customized to the ad hoc data format in question. Recently, to further improve the productivity of programmers who manage ad hoc data sources, we have turned to using PADS as an intermediate language in a system that first infers a PADS description directly from example data and then passes that description to the original compiler for tool generation. A key subproblem in the inference engine is the token ambiguity problem: determining which substrings in the example data correspond to complex tokens such as dates, URLs, or comments. To solve the token ambiguity problem, the paper studies the relative effectiveness of three different statistical models for tokenizing ad hoc data. It also shows how to incorporate these models into a general and effective format inference algorithm. In addition to using a declarative language (PADS) as a key intermediate form, we have implemented the system as a whole in ML.
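To make the token ambiguity problem concrete, the following minimal Python sketch shows how several token definitions can match at the same position in one line of data. The token names, the regular expressions and the longest-match rule are assumptions chosen for illustration only; they are not the PADS token definitions, and the longest-match rule is a simple baseline, not one of the three statistical models studied in the paper.

```python
import re

# Hypothetical token definitions, chosen only for illustration.
# They are not taken from PADS or from the paper.
TOKEN_PATTERNS = {
    "date": re.compile(r"\d{4}-\d{2}-\d{2}"),
    "ip": re.compile(r"\d{1,3}(?:\.\d{1,3}){3}"),
    "float": re.compile(r"\d+\.\d+"),
    "int": re.compile(r"\d+"),
}


def candidate_tokens(text, pos=0):
    """Return every (token name, matched substring) that can start at pos."""
    candidates = []
    for name, pattern in TOKEN_PATTERNS.items():
        match = pattern.match(text, pos)
        if match:
            candidates.append((name, match.group()))
    return candidates


def longest_match(text, pos=0):
    """Non-statistical baseline (an assumption): prefer the longest candidate."""
    candidates = candidate_tokens(text, pos)
    if not candidates:
        return None
    return max(candidates, key=lambda c: len(c[1]))


if __name__ == "__main__":
    line = "192.168.0.1 GET /index.html"
    # The same prefix is a plausible ip, float and int token.
    print(candidate_tokens(line))
    print(longest_match(line))
```

A fixed rule such as longest match picks one reading without looking at the surrounding data, which is the kind of decision the abstract proposes to make with statistical models instead.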
Publication date: 2009